1.  Introduction

A tree may not be the best multiprocessor organization, but it has been proposed by many
researchers for a variety of reasons. For example, a complete binary tree of processing elements can
be the major component of a priority queue resource [15] or of a smart-memory raster graphics
system [10]. A complete binary tree can also serve as a hardware structure for searching [2], for
databases [29], or for direct execution of applicative programming languages [21]. Browning [6]
proposes a complete binary tree for general-purpose multiprocessing.

Attention is also directed to binary trees which are not complete. Floyd and Ullman [8] show
that  strings  described  by  a  regular  expression  can  be  recognized  by  processing  elements  organized 
as  the  parse  tree  of  the  regular  expression.  Foster  and  Kung  [9]  have  a  similar  scheme  based  on 
the  simple  configurable  layout  of  Section  3  (first  presented  in  [17]).  There  are  other  proposals, 
for  example  [27],  of  machine  organizations  which,  though  not  trees,  are  nevertheless  tree-like. 

We shall not debate the merits of the various tree machine architectures here, but shall
confine ourselves to understanding their physical organization. In this regard, one attraction of
trees is that they can be laid out efficiently. Figure 1 shows the familiar H-tree layout originally
proposed by Mead and Rem [22]. This layout of a complete binary tree requires linear area, as
opposed to the O(n lg n) area of the standard layout shown in Figure 2. Leiserson [16] and Valiant [30]
independently discovered that arbitrary binary trees could be laid out in linear area. In fact,
Valiant proved that no crossovers were necessary in a linear-area layout. Based on ideas from
Paterson, Ruzzo, and Snyder [23] and Bhatt and Leiserson [4], planar embeddings of arbitrary
trees that minimize the maximum edge length were given by Ruzzo and Snyder [26].

Heretofore,  the  theoretical  work  on  layouts  has  assumed  that  the  entire  tree  fits  on  a  chip. 
But  the  tree  machines  discussed  above  might  be  much  larger.  Whenever  any  system  is  larger  than 
a single chip, it becomes necessary to partition it among separate chips which can be assembled
at  the  circuit  board  (or  chip  carrier)  level.  What  is  the  most  effective  way  to  partition  a  large 
tree  among  chips? 

This  question  is  pressing  because  although  integrated  circuit  technology  has  been  advancing 
at  a  breathtaking  pace,  one  sector  of  that  technology  has  been  crawling  in  comparison.  The 
technology  for  packaging  chips  severely  limits  the  number  of  external  connections  to  an  integrated 
circuit, and whereas some enthusiastic technologists project an eye-opening 10^8 components per
chip,  two  hundred  pins  per  chip  seems  a  large  number  to  most.  A  chip  that  requires  many  more 
is  unlikely  to  be  realizable  for  quite  some  time. 

Most  of  the  theoretical  work  on  tree  layout  has  also  implicitly  assumed  that  a  given  tree,  after 
masks  have  been  made  of  the  layout,  will  be  replicated  many  times.  This  assumption  is  implicit 
because  of  the  economics  of  integrated  circuit  fabrication  technology:  it  is  expensive  to  make  one 
chip,  but  cheap  to  make  many  copies.  For  this  economic  reason,  manufacturers  of  custom  chips 
have  been  encouraged  to  make  configurable  designs  such  as  gate-arrays,  ROM’s,  and  PLA’s.  The 
entire  chip  is  manufactured  except  for  one  mask.  The  customer  to  whom  the  chip  will  be  sold 
specifies a configuration of the chip, and the final layer of metalization connects up the circuitry
in that particular way. Thus most of the design and fabrication costs are factored over many
custom chips. Nevertheless, many copies must be made of the same custom chip for it to be
economical. 

Figure 2: An O(n lg n) layout of a complete binary tree.

Restructurable integrated circuits provide a means for the interconnections on a chip to be
configured after fabrication. The most common example is a PROM (programmable read-only
memory) in which diodes, which normally pass current, can be blown so that a connection is
no longer made. More recent and exciting is the work on restructurable VLSI at IBM [20] and
MIT Lincoln Laboratory [24]. Connections between two metal layers are produced reliably and
efficiently by laser welding. Connections can also be broken by using the laser to cut wires in the
circuit. Figure 3 shows a scanning electron microscope photograph of laser welds and cuts on a
chip at MIT Lincoln Laboratory.

Figure 3: Laser welds and cuts on a restructurable integrated circuit
chip (courtesy of MIT Lincoln Laboratory).

Restructurable VLSI chips have the advantage that the cost of quantity-of-one designs can still
be factored over many chips. Some propose going further, to systems that include dynamically
restructurable interconnections. For example, the proposed CHiP project at Purdue (Snyder [28])
is a dynamically restructurable multiprocessor. It has not yet been demonstrated that large-scale
dynamically restructurable interconnections are economically feasible, because of overheads in
reliability, area, performance, and fabrication sophistication, but our results do indeed apply to
dynamically restructurable layouts.

The  rest  of  this  paper  addresses  packaging  constraints  and  restructurable  VLSI  with  regard 
to  tree  layouts.  Section  2  gives  a  chip  with  four  pins  that  can  be  used  as  the  sole  building  block 
for  arbitrarily  large,  complete  binary  trees.  A  simple,  but  nonoptimal,  restructurable  layout  that 
can  implement  any  binary  tree  is  given  in  Section  3.  Section  4  proves  a  two-color  bisector  theorem 
for  trees  which  is  the  main  technical  tool  for  producing  the  restructurable  chip  given  in  Section 
5. This chip of M vertices has linear area and O(lg M) pins, and it can be used in quantity to
assemble  any  binary  tree  of  any  size.  Section  6  contains  extensions  and  conclusions. 

2.  Packaging  a  complete  binary  tree 

This  section  studies  the  problem  of  packaging  complete  binary  trees,  and  presents  the  design  of 
a single chip with four pins that can be used to build arbitrarily large complete binary trees. This
chip, originally proposed in [17], has since been used (at the circuit board level) in tree-machine
projects at Caltech and Bell Laboratories [7].

We begin, however, by examining the inefficient partitioning of a complete binary tree proposed
in [15] and elsewhere (for example, [6]). Each of the squares in Figure 4 is a Type A chip and is
packed as full as possible with processors in the H-tree layout of Figure 1. The rectangle above is
a Type B chip which contains the standard O(n log n) area layout of Figure 2, but with each leaf
connected off-chip. The Type B chip can be used repeatedly to combine several smaller complete
binary trees into a larger one.


Figure  4:  An  inefficient  partitioning  of  a  complete  binary  tree  into  Type 
A  and  Type  B  chips. 

Theorem 1. Suppose Type A chips each contain P = 2^p − 1 vertices, and Type B chips
each contain Q = 2^q − 1 vertices. Then a complete binary tree with at least N = 2^n − 1
vertices can be assembled from

• (N + 1)/(P + 1) Type A chips and

• (N − P)/(Q(P + 1)) Type B chips.

Proof. The complete binary tree can be assembled using the scheme from Figure 4. ∎

We can do better, however. Figure 5 shows a Type C chip with only four off-chip connections.
Arbitrarily large complete binary trees can be assembled from this one kind of chip. Each chip
contains one internal node of the tree, and the remainder of the chip is packed as full as possible
with an H-tree layout. The internal node requires three off-chip connections (denoted F, R, and
L in the figure) for its father, right son, and left son. The H-tree requires only one off-chip
connection (denoted T) to its father.

Theorem 2. Suppose Type C chips each contain M = 2^m vertices. Then a complete binary
tree with at least N = 2^n − 1 vertices can be assembled from (N + 1)/M Type C chips.

Proof  We  show  how  arbitrarily  large  complete  binary  trees  can  be  built  up.  To  interconnect 
two  chips,  the  unconnected  internal  node  of  one  of  the  two  chips  is  selected  as  the  father  of  the 
two  H-trees.  In  Figure  6  the  internal  node  on  the  left  has  been  chosen  for  this  purpose.  The  R 
pin  on  this  chip  is  connected  to  its  own  T  pin,  and  the  L  pin  is  connected  to  the  T  pin  on  the 
other  chip.  Considered  as  a  unit,  the  combined  two  chips  now  have  the  same  structure  as  a  single 
chip  -  three  connections  to  an  internal  node  and  one  to  the  root  of  a  complete  binary  tree.  The 
pair  of  chips  can  be  similarly  combined  with  another  pair  to  produce  a  quadruple  of  chips,  which 
can in turn be combined, and so forth inductively, as is shown in Figure 7. ∎
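
The doubling step in this proof can be checked mechanically. The following Python sketch is only an illustration: the names make_chip, combine, and assemble, the heap-style numbering of the on-chip H-tree, and the dictionary representation are assumptions of this example, not part of the chip design. It models each Type C chip as a spare internal node plus a complete tree of 2^m − 1 vertices, pairs units as in Figure 6, and reports the number of levels of the resulting complete binary tree.

```python
def make_chip(chip_id, m):
    """One Type C chip with M = 2**m vertices: a spare internal node plus an
    on-chip complete binary tree (the H-tree) of 2**m - 1 vertices."""
    children = {}
    # Heap-style numbering (an assumption): vertex j has children 2j and 2j+1.
    for j in range(1, 2 ** m):
        children[(chip_id, j)] = [(chip_id, c) for c in (2 * j, 2 * j + 1)
                                  if c < 2 ** m]
    spare = (chip_id, 0)              # the internal node with pins F, R, L
    children[spare] = []
    return {"spare": spare, "root": (chip_id, 1), "children": children}


def combine(a, b):
    """Join two equal units: a's spare node becomes the father of both roots.
    b's spare node stays free, so the result has the shape of a single chip."""
    children = {**a["children"], **b["children"]}
    children[a["spare"]] = [a["root"], b["root"]]   # R -> own T, L -> other T
    return {"spare": b["spare"], "root": a["spare"], "children": children}


def assemble(num_chips, m):
    """Pair units repeatedly; num_chips is assumed to be a power of two."""
    units = [make_chip(i, m) for i in range(num_chips)]
    while len(units) > 1:
        units = [combine(units[i], units[i + 1])
                 for i in range(0, len(units), 2)]
    return units[0]


def levels_if_complete(unit):
    """Number of levels of the tree under unit['root'] if it is a complete
    binary tree, otherwise None."""
    def depth(v):
        kids = unit["children"][v]
        if not kids:
            return 1
        if len(kids) != 2:
            return None
        d = [depth(k) for k in kids]
        if None in d or d[0] != d[1]:
            return None
        return d[0] + 1
    return depth(unit["root"])
```

For instance, assemble(8, 3) joins eight chips of M = 8 vertices into a complete tree with six levels, that is N = 63 vertices, in agreement with (N + 1)/M = 8; one internal node is left unused.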
This method has many advantages over the two-chip method. Most obviously, the
one-chip method uses only one kind of chip. Why manufacture two kinds when one will do?
Second, only four data paths go off chip. Third, the Type C chip is packed full, while the Type B
chip is almost empty because it is pin bound. Finally, the area of the assembly (on a circuit board,
for example) is linear in the number of Type C chips used. The two-chip solution gives an O(n log n)
area circuit layout. Although the case is not particularly strong for asymptotic analysis of circuit
layout, the constant factors give a clear preference to the more regular, linear-area layout. If
circumstances permit, the wires connecting the chips can in fact be routed underneath the chips
themselves, thereby requiring no more area on the circuit board than the chips themselves.

3.  A  restructurable  chip  for  packaging  arbitrary  trees 

This section presents a simple (but suboptimal) scheme for packaging arbitrary trees using a
single restructurable chip. The solution is suggested by a technique of Bentley and Leiserson [17]
for producing collinear layouts for arbitrary trees. The strategy for producing collinear layouts
is, in turn, based on the observation that trees have a small separator theorem. This section
defines separator theorems, describes the strategy for producing collinear layouts, and proposes
a simple packaging scheme. Although the solution is asymptotically suboptimal, the results are
crucial to the optimal scheme presented in the next section.

Separator theorems [19] have been applied to solve a variety of graph-theoretic problems
including graph layout (for example, [3, 14, 16, 17, 30]). Formally, let S be a family of graphs
closed under the subgraph relation, and let α < 1/2 and β be positive constants. If every graph
on n vertices in S can be separated into two disconnected components, each having at least ⌊αn⌋
vertices, by removing no more than βf(n) edges, then S has an f(n)-separator theorem.

By removing a single edge, any N-vertex binary tree can be separated into two components,
each with no more than ⌊2N/3⌋ + 1 vertices [18]. (The worst case occurs for the four-vertex tree
in which one vertex is adjacent to three others.) Either of the two components may be a forest,
but since the same result applies to forests, the binary tree can be split recursively. Since each
of the recursively generated subgraphs can be split by removing a single edge, the class of binary
trees has a one-separator theorem.
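
A single-edge separator of this kind is easy to find by computing subtree sizes. The sketch below is an illustration under assumptions of its own (vertices numbered 0 to N − 1, the tree given as an edge list, and the hypothetical name best_edge_separator). It returns the edge whose removal minimizes the larger component; for a tree of maximum degree three that size is at most ⌊2N/3⌋ + 1.

```python
from collections import defaultdict


def best_edge_separator(n, edges):
    """Return (edge, larger_size) for the edge of an n-vertex tree (n >= 2)
    whose removal minimizes the size of the larger component."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Breadth-first order from vertex 0, recording each vertex's parent.
    parent = {0: None}
    order = [0]
    for u in order:                    # the list grows while we iterate
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                order.append(v)

    # Subtree sizes, accumulated from the leaves upward.
    size = [1] * n
    for u in reversed(order[1:]):
        size[parent[u]] += size[u]

    # Removing edge (parent[u], u) leaves components of size[u] and n - size[u].
    best = None
    for u in order[1:]:
        larger = max(size[u], n - size[u])
        if best is None or larger < best[1]:
            best = ((parent[u], u), larger)
    return best
```

On the four-vertex tree with one vertex adjacent to three others, every edge leaves components of sizes 1 and 3, so the function reports 3, which equals ⌊8/3⌋ + 1.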

Bentley and Leiserson [16] used the one-separator theorem for trees to produce collinear
layouts for binary trees. In a collinear layout all the vertices are placed along a common baseline,
and tree edges are routed along horizontal and vertical tracks on one side of the baseline, as seen
in Figure 8. The height of a collinear layout is defined as the number of distinct horizontal tracks
used for routing the edges. As shown in the following lemma, efficient collinear layouts can be
produced using the one-separator theorem for binary trees. (In fact, Yannakakis [31] has shown
that a minimum height layout can be obtained for a given N-vertex tree in O(N lg N) time.)

Lemma 3. Every N-vertex binary tree has a collinear layout with height no greater than
lg N.

Figure 8: The construction of a collinear layout.

Proof. Using the one-separator theorem, first separate the tree. If either component contains
more than N/2 vertices, separate it into two smaller components using the one-separator theorem
again. Next, recursively construct collinear layouts for each subforest, and place these layouts
side by side along the baseline. Finally, as shown in Figure 8, connect the two (or three) subforests
by routing the separator edges on distinct vertical tracks and along a common horizontal track.
(For two components this is trivial since only one edge is routed; for three components, place the
subforest connected to both other subforests in the middle as shown.) For each node there are
three vertical tracks to accommodate edges incident to that node.

The height of the layout is determined by a simple recurrence relation. Let h(N) be the height
of the layout, so that h(1) = 0, and in general,

h(N) ≤ h(⌊N/2⌋) + 1.

A straightforward calculation yields h(N) ≤ lg N. ∎
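
The calculation can be spot-checked numerically. This small Python sketch (the function name height_bound is hypothetical) takes the recurrence with equality, reading the argument as the floor of N/2, which is an assumption made here, and unrolls it.

```python
def height_bound(n):
    """Worst case of h(N) <= h(floor(N/2)) + 1 with h(1) = 0,
    taken with equality at every step."""
    return 0 if n <= 1 else height_bound(n // 2) + 1
```

Under that assumption the value is the floor of lg N, so it never exceeds lg N; for example, height_bound(1000) is 9.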

Corollary 4. Any binary tree with N vertices can be bisected into components of sizes
⌊N/2⌋ and ⌈N/2⌉ by removing at most lg N edges.

Proof. Consider the vertical line that passes midway through the collinear layout. It bisects
the N vertices and the number of edges it cuts is no more than lg N, the height of the layout. ∎

The collinear layout can also be used to make a configurable chip of N vertices which can
realize any N-vertex binary tree. The chip consists of N collinear vertices, with three vertical
wires connected to each vertex, and lg N contacts along each vertical wire. Every N-vertex
binary tree can be configured on this chip by specifying one extra custom layer. The custom
layer consists of the portions of the wires in the collinear layout that run horizontally. The
horizontal wires run between the rows of contacts, and spurs to the contacts make connections.

An unattractive feature of the configurable chip is that a different mask must be designed for
each tree. Not surprisingly, the same idea can be used to design a restructurable chip for trees,
where the chip is customized (for example, by laser) after fabrication. Once again, the collinear
layout serves as the basis for the design. The restructurable chip consists of vertical wires running
the height of the layout on one layer, and horizontal wires running the width of the layout on
another. By using laser welds to connect various horizontal wires to appropriate vertical wires,
and laser trimming to break horizontal wires, any tree can be realized in accordance with its
collinear layout. The number of connections made or broken is O(N).

Figure 9: A Type D restructurable chip which can be used to assemble
large binary trees by making and breaking connections.

This restructurable layout also suggests a method of packaging arbitrary binary trees using a
single Type D restructurable chip, which is shown in Figure 9. From each of the collinear vertices,
three vertical wires are run. At every intersection of a horizontal and vertical wire is a weld point
which can be programmed after fabrication. Each horizontal wire is connected to pins at either
end.


Theorem 5. Suppose Type D chips each contain M vertices and n horizontal wires. Then
any binary tree with N = 2^n vertices can be realized with ⌈N/M⌉ Type D chips.

Proof. Take the ⌈N/M⌉ chips and place them side by side in the natural way, hooking up
adjacent pins. Following Lemma 3, draw a collinear layout of height at most lg N for the N-vertex
tree. Map the layout onto the assembly in the obvious manner. Make and break connections on
each chip to realize the layout. ∎

Unfortunately, if a tree with more than 2^n vertices were required, this chip might not be
able to configure it. In the next section a better packaging scheme is developed whereby one
restructurable chip containing M vertices in linear area and O(lg M) pins can be used to package
arbitrarily large binary trees.

Some restructurable technologies do not allow connections to be broken, and thus the scheme
of  Theorem  5  will  not  work.  A  naive  alternative  is  to  break  every  horizontal  wire  into  M  unit 
length  segments.  Each  segment  can  be  connected  to  vertical  wires  and  to  its  neighboring  segments 
on  the  same  horizontal  track.  Unfortunately,  programming  the  interconnect  requires  a  large 
number  of  welds  to  be  made  on  an  edge  connecting  two  vertices.  The  scheme  from  Theorem  5 
requires  only  two  welds  for  each  edge. 

Figure 10 shows a Type E restructurable chip which can realize any tree by making, but not
breaking, connections such that only two welds are required per edge. The chip has M = 2^m
vertices and n horizontal tracks which are divided into groups. The first group contains one
horizontal track which consists of M/2 unit-length wire segments. The second group contains
two horizontal tracks, each with M/4 wire segments of length 2. In general, for i = 1, 2, …, m,
the ith group contains i tracks, each with M/2^i wire segments of length 2^(i−1). The remainder of the
horizontal tracks are in group m + 1. Each of these tracks has one wire of length M connected
off chip.

Figure 10: A Type E restructurable chip which can be used to assemble
large binary trees without breaking connections.


Theorem 6. Suppose Type E chips each contain M = 2^m vertices and n horizontal tracks.
Then any binary tree with N = c·2^√(2n) vertices can be realized with ⌈N/M⌉ Type E chips,
where c is a constant (c ≈ 1/√2).

Proof. Lay the ⌈N/M⌉ chips side by side, and connect the pins to continue the on-chip
grouping scheme such that for i = 1, 2, …, lg N, group i contains i tracks, each with N/2^i wire
segments of length 2^(i−1). The total number of horizontal tracks is

h(N) = 1 + 2 + ⋯ + lg N
     = (1/2) lg N (lg N + 1)
     ≤ (1/2)(lg c + √(2n))(lg c + √(2n) + 1)
     = n − 1/8

for c = 1/√2, and thus n tracks are sufficient.

Observe that this assembly without its top group of lg N horizontal wires forms two smaller
versions of itself. To realize a given tree, remove the lg N bisector edges as in Corollary 4,
and recursively lay out the equal size components within the two smaller layouts. Combine the
sublayouts by routing the bisector edges along the top group of wires that run across the layout.
Since two connections are formed for each tree edge, the total number of welds is 2N − 2. ∎
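
The track count in this proof is a triangular number, so the largest tree that a given track budget supports can be read off directly. The following Python sketch uses hypothetical function names and assumes, as in the proof, that group i contributes i horizontal tracks.

```python
def tracks_needed(lg_n):
    """Total horizontal tracks when groups 1..lg N hold 1, 2, ..., lg N tracks."""
    return lg_n * (lg_n + 1) // 2


def max_lg_n(n_tracks):
    """Largest lg N such that 1 + 2 + ... + lg N fits within n_tracks."""
    lg_n = 0
    while tracks_needed(lg_n + 1) <= n_tracks:
        lg_n += 1
    return lg_n
```

With n = 10 tracks the count allows lg N = 4, that is trees of up to 16 vertices, which is in line with the bound of roughly 2^√(2n)/√2 stated in Theorem 6.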




Figure  11:  At  some  point,  a  window  of  size  n/2  slid  along  the  base  of 
the  two-color  collinear  layout  must  contain  half  the  white  and  half  the  black 
vertices. 


4.  Two-color  bisector  theorems 

Although the Type D restructurable chip with M vertices and 2n pin connections provides
one way to package large trees, it suffers two disadvantages. First, it cannot be used to assemble
trees with more than 2^n vertices. Second, and more important, the chip is wasteful in area. In
fact, although every N-vertex tree can be laid out in O(N) area [16, 30], a collinear layout for the
complete binary tree requires at least Ω(N lg N) area [5, 16]. Thus we are led to ask: Does there
exist a restructurable chip with M vertices, occupying O(M) area, and having few pins, which can
realize every binary tree, no matter how large?

In  the  next  section  we  answer  this  question  affirmatively.  The  question  is  fairly  subtle, 
however, and does not follow as a straightforward application of the separator theorem. While we
can effectively use the separator theorem to recursively bisect a tree into equal size components (as
in Theorem 6), there is nothing to bound the number of external edges that connect a component
to the rest of the tree. Thus for example, suppose we designed a chip with M vertices and P pins
for  packaging  arbitrarily  large  trees.  How  can  we  guarantee  that  every  tree  can  be  decomposed 
into  subgraphs  of  size  at  most  M  such  that  each  component  has  no  more  than  P  external  edges? 

In  this  section  we  introduce  the  notion  of  two-color  bisector  theorems  which  can  be  used  to 
recursively  bisect  a  graph  while  also  bounding  the  number  of  external  edges  into  each  component. 
Moreover,  trees  have  small  two-color  bisector  theorems,  so  that  the  number  of  external  edges  into 
a  component  is  also  small.  These  results  use  arguments  from  the  previous  section.  In  the  next 
section,  we  apply  two-color  bisector  theorems  to  design  an  optimal  packaging  scheme  for  binary 
trees. 

Definition. Suppose that an N-vertex graph G has b black vertices and w white vertices. A
two-color bisector for G is a set of edges whose removal bisects G into two subgraphs each of
size at least ⌊N/2⌋, and such that each contains at least ⌊b/2⌋ black and ⌊w/2⌋ white vertices.
Figure 12: To keep the number of external connections to all subcomponents
small when a component is bisected, the external connections must
be evenly divided between the subcomponents.


Theorem 7. Every N-vertex forest of binary trees has a two-color bisector of size 2 lg N.

Proof. Following Lemma 3, construct a collinear layout of height at most lg N. Suppose
there are b black vertices and N − b white vertices. Consider a "window" which overlaps ⌊N/2⌋
consecutive vertices, and place it over the leftmost ⌊N/2⌋ vertices. If more than ⌈b/2⌉ black
vertices fall within the window, slide the window one position to the right. Observe that by
sliding the window one position, the number of black vertices within the window changes by at
most one. Furthermore, by sliding the window all the way to the right, fewer than ⌊b/2⌋ black
vertices would fall within the window. Consequently, there must be an intermediate placement
of the window (see Figure 11) in which at least ⌊b/2⌋ black vertices and at least ⌊(N − b)/2⌋ white
vertices are contained within the window. (Such a placement can be obtained in linear time.)

Draw vertical lines through the endpoints of the window in the position obtained above. The
edges of the forest intersecting these lines form a two-color bisector of the forest. The size of this
two-color bisector is no more than twice the height of the layout. Thus the size of the two-color
bisector is no more than 2 lg N. ∎
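
The sliding-window search in this proof can be sketched in a few lines of Python. The representation is an assumption made for illustration: the vertices are given in their left-to-right collinear order as a string of 'B' and 'W' characters, and the function name two_color_window is hypothetical. The function returns the starting index of a window of ⌊N/2⌋ consecutive vertices such that both the window and its complement contain at least ⌊b/2⌋ black and ⌊w/2⌋ white vertices.

```python
def two_color_window(colors):
    """colors: sequence of 'B'/'W' in collinear order.
    Returns the start index of a balanced window of floor(N/2) vertices,
    or None if no placement qualifies."""
    n = len(colors)
    half = n // 2
    b = sum(1 for c in colors if c == "B")
    w = n - b

    # Black vertices inside the leftmost window.
    k = sum(1 for c in colors[:half] if c == "B")

    for s in range(n - half + 1):
        white_in = half - k
        balanced = (k >= b // 2 and b - k >= b // 2 and
                    white_in >= w // 2 and w - white_in >= w // 2)
        if balanced:
            return s
        # Slide one position right: drop colors[s], admit colors[s + half].
        if s + half < n:
            k += (colors[s + half] == "B") - (colors[s] == "B")
    return None
```

For the ordering BBBBWWWW the search stops at index 2, where the window BBWW holds two vertices of each color. Each slide changes the black count by at most one, so the scan takes linear time.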

For our purposes the following variant of two-color bisectors is more suitable. Suppose each
vertex of an N-vertex forest is assigned a weight from a bounded set {1, 2, …, k} of weights. We
wish to bisect the forest into two subforests, each of size at least ⌊N/2⌋, whose total weights
differ by at most k. How many edges need be cut? Adapting the argument for two-color bisectors
to this variant in a straightforward manner shows again that 2 lg N cuts suffice.

Having  obtained  bounds  on  the  size  of  two-color  bisectors  for  forests,  we  wish  to  use  them 
for partitioning an arbitrarily large binary tree into subforests of size at most M so that every
subforest  has  few  edges  connected  to  vertices  in  other  subforests.  This  result  is  established  in 
the  following  theorem. 


Theorem 8. Every N-vertex binary tree can be partitioned into ⌈N/M⌉ subforests, each of
size at most M, such that no subforest has more than 4 lg M + 8 edges connected to vertices
in other subforests.

Proof. We prove the theorem for the case when N = 2^r M. The general case may be proved
similarly, but we omit the tedious details of the analysis. As in Theorem 6, bisect the tree into two
subforests, each of size at least ⌊N/2⌋, by cutting no more than lg N edges. Split each subforest
recursively as follows. For each vertex in a recursively split component of size m, assign a weight
equal to the number of edges incident to that vertex which were cut at a previous level. Since
the degree of a vertex is at most three, the weight assigned to a vertex is at most 2. From the
argument following Theorem 7, there is a weighted bisector of size no greater than 2 lg m for the
component. This weighted bisector divides the number of external connections almost equally
(the difference is at most two) between the subcomponents of sizes ⌊m/2⌋ and ⌈m/2⌉. As seen in
Figure 12, the number of external connections into either of the new subcomponents is no more
than the size of the weighted bisector plus one-half the number of external connections into the
component just split (plus two). This recursive decomposition terminates when each component
has size at most M. Letting E(m) be the number of external connections into any component of
size m, we have E(N) = 0, and

E(m) ≤ (1/2) E(2m) + 2 lg(2m) + 2.

A little calculation shows that E(m) ≤ 4 lg m + 8. This means that every subforest of size m in the
recursive decomposition has at most 4 lg m + 8 external edges to other subforests. Substituting
M for m, the result follows. ∎
Figure 13: A k-by-k restructurable permuter can realize any set of one-
to-one connections between the terminals on the two sides.

5.  An optimal packaging scheme

The recursive decomposition of Theorem 8 leads directly to the design of an efficient
restructurable chip which can be used in quantity to assemble any tree. This Type F restructurable
chip has M vertices, O(lg M) pins, and an O(M) area layout. This packaging scheme is the best
possible when all vertices on the chips are utilized.

The design of the Type F chip uses restructurable permuters. A permuter P_k has k terminals
on each side of a rectangle and can realize any one-to-one connection between the terminals. The
switch shown in Figure 13 implements a permuter. It has dimensions 2k × k, with the terminals
along the longer sides.

The construction of the Type F restructurable chip is recursive and follows the recursive
decomposition of Theorem 8. We shall use R_m to denote a level of the recursive layout with m
vertices, and let R_M denote the restructurable Type F chip of M vertices itself. Figure 14 shows
how the Type F chip R_M is constructed from four copies of R_(M/4), four copies of P_(4 lg M), and
two copies of P_(4 lg M + 4). Letting S(M) be the length of the side of the layout, we have S(1) = 1
and

S(M) ≤ 2S(M/4) + O(lg M),

which yields S(M) = O(√M), so that the area is linear in M. The number of pins on R_M is
4 lg M + 8. We now show that every large tree can be assembled using R_M.
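
The side-length recurrence can be unrolled numerically to see why its solution grows as √M. In this Python sketch the function name side_length and the constant c standing in for the hidden factor of the O(lg M) term are assumptions made for illustration; the recurrence is taken with equality for M a power of four.

```python
def side_length(j, c=1):
    """S(M) for M = 4**j, with S(1) = 1 and S(M) = 2*S(M/4) + c*lg M.
    Note that lg(4**j) = 2*j."""
    if j == 0:
        return 1
    return 2 * side_length(j - 1, c) + c * 2 * j
```

With c = 1 the values stay below 5·√M (for example, S(16) = 12 and S(256) = 68), because the additive terms shrink geometrically relative to the doubling of the side, so the area S(M)² remains linear in M.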

Theorem 9. Suppose Type F chips each contain M vertices. Then any N-vertex binary
tree can be assembled using ⌈N/M⌉ Type F chips, the minimum possible.

Proof. As before, we assume that N = 2^r M, although the result extends in a straightforward
manner to the general case. Following Theorem 8, decompose the tree into ⌈N/M⌉ components,
each of size at most M and having no more than 4 lg M + 8 external edges to other components.
Each of the ⌈N/M⌉ components can be realized on a single Type F chip R_M. To see this, use
Theorem 8 to recursively decompose each component into single vertices. In this decomposition
each subforest of size m has at most 4 lg m + 8 external edges. This decomposition may now
be mapped directly onto the chip, using the permuters to route edges between different
subcomponents. Since the number of external edges at any level is no greater than the size of the
permuters at that level, the permuters can realize the desired routing. Vertices of the tree are
embedded at fixed positions in the lowest level permuters. Finally, each chip has enough pin
connections so that the assembly can be completed off-chip by connecting the chips together as
required by the original decomposition. (Permuters are not needed off chip because wires can be
routed directly.) ∎

Figure 14: The Type F restructurable chip R_M which can be used to
assemble arbitrarily large binary trees.

The constant factors on area can be improved if one uses the smaller restructurable permuter
P_k with dimensions (k + O(√k)) × (k + O(√k)) that follows from the channel routing algorithm of
[1]. Whereas the simpler permuter from Figure 13 requires only two welds to make a connection,
the more dense layout might require as many as k welds for each connection. Although the total
number of welds required by either scheme is O(M), the number per wire is O(lg M) if the simpler
switch is used and O(lg² M) if the channel-routing permuter is used.

In related work, Rosenberg [25] has also considered permuters to obtain a degree of
configurability in layouts.

6.  Extensions  and  conclusions 

All the layout techniques presented here extend to more general classes of graphs. In
particular, the techniques extend to classes of graphs not closed under the subgraph relation by
extending the definition of separator theorems as in [16] or [14] to apply recursively to graphs
generated by the separator. For example, graphs with n^α-separator theorems have linear-area
restructurable layouts if α < 1/2. When α = 1/2, the area is O(n lg² n), and if α > 1/2, the area
is O(n^{2α}). These area bounds match the layout areas of [16] and [30] while requiring the layouts to
be restructurable. In each case the number of pins on a chip is Θ(n^α) if α > 0, and Θ(lg n) if
α = 0.

These bounds are obtained by recursively using the separator theorem to produce a collinear
layout and then chopping the layout with two cuts to yield a two-color bisector. There is one
technical detail in using the extended notions of separator theorems in [16] and [14] to accomplish
the cuts of the collinear layouts, since we must make sure that the two-color bisector theorem
applies recursively to the two halves of the graph. Rather than just cutting the edges incident to
the two vertical lines, one must in addition cut a constant factor more edges in order that each of
the subgraphs generated by the two-color bisector is the union of disjoint subgraphs generated by
the separator theorem. A more general divide-and-conquer framework for this problem is given
in [3].

The methods for tree assembly considered in this paper have all assumed that the overall
utilization of the chips is 100 percent; specifically, only ⌈N/M⌉ chips are used to assemble
an N-vertex tree with chips that hold M vertices. Not surprisingly, if the assumption of full
utilization is relaxed, fewer pins are needed. In particular, we can guarantee 50 percent utilization
with six-pin chips using an idea due to Tom Leighton.

The assembly is generated recursively as in Section 5. At each step of the divide-and-conquer
construction, there is a subforest A with at most six external connections. This subforest can
always be split into two components, each containing at least one-sixth of the nodes and at most
six external connections. We first use the standard separator theorem to remove one edge that
splits A into two components B and C with at worst a 1/3 : 2/3 ratio. The only case to worry about
is if all the original external connections are incident to B (or to C) because the newly removed
edge will now give B seven external connections. If this bad split indeed occurs, we split B further
into B_1 and B_2 so that the seven connections are divided 3:4. (There is no constraint on the
ratio of the size of B_1 to B_2.) Finally, we take whichever of B_1 and B_2 is smaller and combine it
with C. Of the two remaining components, neither has more than six external connections, and
each has at least ⌊|A|/6⌋ vertices.
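
The first step of this splitting procedure, removing a single edge that separates the tree into two reasonably balanced parts, can be sketched as follows. This is a minimal illustration rather than the construction used in the paper: it assumes the component is a single connected tree given as an edge list on vertices 0..n-1, it simply tries every edge, and the name best_edge_cut is hypothetical. For a tree of maximum degree 3 the smaller side of the best cut has at least (n - 1)/3 vertices.

```python
from collections import defaultdict


def best_edge_cut(n, edges):
    """Return ((u, v), small, large) for the most balanced one-edge cut.

    Assumes `edges` describes a connected tree on vertices 0..n-1
    with n >= 2 (an assumption of this sketch).
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative depth-first search from vertex 0; parents precede
    # their children in `order`.
    parent = {0: None}
    order = []
    stack = [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                stack.append(w)

    # Subtree sizes, accumulated children first.
    size = [1] * n
    for u in reversed(order):
        if parent[u] is not None:
            size[parent[u]] += size[u]

    # Cutting the edge (parent[u], u) leaves parts of size[u] and n - size[u].
    best = None
    for u in order[1:]:
        small = min(size[u], n - size[u])
        if best is None or small > best[1]:
            best = ((parent[u], u), small, n - small)
    return best


if __name__ == "__main__":
    # Complete binary tree on 15 vertices, stored heap-style.
    tree = [(i, (i - 1) // 2) for i in range(1, 15)]
    print(best_edge_cut(15, tree))
```

The further split of B by external connections, and the merge of the smaller piece with C, would be layered on top of a routine like this one; they are omitted here to keep the sketch short.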

The recursion terminates when any subforest has M or fewer vertices, in which case the
subforest is embedded on a Type F chip. Of course, only six of the O(lg M) connections are
actually used. The assembly method will never require more than 2⌈N/(M + 1)⌉ chips. The
worst case occurs when every branch of the recursion terminates with the splitting of a subforest
of size M + 1. Higher utilization can be attained at the expense of more pins by generalizing this
technique.
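
The two chip counts can be compared directly. The short sketch below only evaluates the formulas stated in the text; the function names and the sample values N = 10000 and M = 100 are chosen here for illustration and do not come from the paper.

```python
def chips_full_utilization(n, m):
    """Minimum number of M-vertex chips for an N-vertex tree: ceil(N / M)."""
    return -(-n // m)


def chips_six_pin_worst_case(n, m):
    """Worst-case chip count for the six-pin scheme: 2 * ceil(N / (M + 1))."""
    return 2 * -(-n // (m + 1))


def utilization(n, m, chips):
    """Fraction of the available vertex positions actually occupied."""
    return n / (chips * m)


if __name__ == "__main__":
    n, m = 10000, 100  # illustrative values only
    full = chips_full_utilization(n, m)
    six = chips_six_pin_worst_case(n, m)
    print(full, utilization(n, m, full))
    print(six, utilization(n, m, six))
```

Since ⌈N/(M + 1)⌉ never exceeds ⌈N/M⌉, the six-pin scheme uses at most twice the minimum number of chips.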

Since our discovery of two-color bisectors and their relation to restructurable layouts, they
have been used in other VLSI layout problems. Based on partial knowledge of our work, Leighton
[12] showed independently that any graph that has a √n-separator theorem can be embedded in his
“tree of meshes,” which is similar to the restructurable layout obtained when f(n) = √n. He and
Rosenberg [13] have also used three-color bisector theorems to obtain optimal three-dimensional
VLSI layouts.

The use of the collinear layout for obtaining a two-color bisector theorem from a separator
theorem is combinatorially appealing, and can be recast as a necklace problem. Given a necklace
of black and white pearls, how many cuts are necessary in order to divide the necklace into
two pieces such that each of the pieces has the same (to within one) number of pearls of each
color? The obvious extension is to ask how many cuts are necessary to divide a necklace of k
colors. Unfortunately, the naive idea of sliding a window across the collinear layout fails to work
if k ≥ 3. Recently, Goldberg and West [11] at Princeton, hearing of our open problem, developed
an elegant topological argument to show that k cuts suffice, which is tight in that k cuts are
necessary in some cases. This result implies, for example, that trees with k colors have O(k lg n)
k-color bisectors and planar graphs with k colors have O(k√n) k-color bisectors.
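
For two colors, the sliding-window idea can be written down in a few lines. The sketch below is an illustration under assumptions of its own: the necklace is modeled as an open string of the characters 'B' and 'W' in collinear order, and the name two_color_bisect is hypothetical. A window covering half the pearls slides from one end to the other; its black count changes by at most one per step, so at some position it holds half the black pearls to within one. The two ends of the window are the two cuts.

```python
def two_color_bisect(necklace):
    """Find a half-length window holding half the black pearls.

    `necklace` is a string of 'B' and 'W' (an assumed encoding).
    Returns (start, end) so that necklace[start:end] is one piece and
    the rest of the necklace is the other.
    """
    n = len(necklace)
    half = n // 2
    total_black = necklace.count("B")
    lo, hi = total_black // 2, (total_black + 1) // 2

    black = necklace[:half].count("B")
    for start in range(n - half + 1):
        if lo <= black <= hi:
            return start, start + half
        if start + half < n:
            # Slide right by one pearl: one enters, one leaves.
            black += (necklace[start + half] == "B") - (necklace[start] == "B")
    return None


if __name__ == "__main__":
    print(two_color_bisect("BBBBWWWW"))
```

With three or more colors a single window no longer has enough freedom to balance every color at once, which is why the general case needs a different argument.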